Skip to content

feat: supported multiple rows update for sharding key update#1109

Open
yogesh1801 wants to merge 5 commits into
pgdogdev:mainfrom
yogesh1801:feat-multi-row-update
Open

feat: supported multiple rows update for sharding key update#1109
yogesh1801 wants to merge 5 commits into
pgdogdev:mainfrom
yogesh1801:feat-multi-row-update

Conversation

@yogesh1801

Copy link
Copy Markdown
Contributor

No description provided.

@levkk

levkk commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

Apparently it was easy. Could you include some test coverage?

@yogesh1801

yogesh1801 commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

@levkk 2 problems what i face is

  1. perf regression on insert rows, earlier if original shard == final_shard then directly run update on the shard, but now since we expect multiple rows this is not possible as UPDATE uses complete WHERE clause from the origin query so now even if same shard both INSERT and DELETE Runs, so there is a tradeoff

  2. For tests i was thinking to run something like UPDATE SET id = id + 10 WHERE id IN ( 1, 2 ) but looks like id + 10 type expressions are not supported, if i do something like id = 11 it gives primary key violation so for tests should i remove PK constraint from the table for just this test.

@levkk

levkk commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

Regression on insert rows, earlier if original shard == final_shard then directly run update on the shard, but now since we expect multiple rows this is not possible as UPDATE uses complete WHERE clause from the origin query so now even if same shard both INSERT and DELETE Runs, so there is a tradeoff

You should still be able to validate that each row matches the original shard - just loop over the results and perform the same check we do for one row. The only difference is all checks have to pass, or we fallback to the insert/delete workflow.

For tests i was thinking to run something like UPDATE SET id = id + 10 WHERE id IN ( 1, 2 ) but looks like id + 10 type expressions are not supported, if i do something like id = 11 it gives primary key violation so for tests should i remove PK constraint from the table for just this test.

You can do this instead:

UPDATE ... SET id = 5 WHERE true;

(it was 10 before, or an ID that matches a different shard actually).

@yogesh1801

Copy link
Copy Markdown
Contributor Author

You should still be able to validate that each row matches the original shard - just loop over the results and perform the same check we do for one row. The only difference is all checks have to pass, or we fallback to the insert/delete workflow.

Thought of this but how likely is this case that all of them belong to same bag?, though there is no harm in checking all rows and see if they all match the same shard since it will be a quick check.

@levkk

levkk commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

Thought of this but how likely is this case that all of them belong to same bag?, though there is no harm in checking all rows and see if they all match the same shard since it will be a quick check.

Could be high, e.g.:

UPDATE users SET tenant_id = 7 WHERE user_id IN ($1);

That's going to return a lot of rows but if they belong to the same tenant in the first place, and the new tenant_id is on the same shard, we could save ourselves a lot of work.

@yogesh1801

Copy link
Copy Markdown
Contributor Author

@levkk have added the above suggestion could you review

@yogesh1801

yogesh1801 commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

not sure, why tests failing

Ok they passed :)

@codecov

codecov Bot commented Jun 26, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 87.50000% with 7 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
.../frontend/client/query_engine/multi_step/update.rs 87.50% 7 Missing ⚠️

📢 Thoughts on this report? Let us know!

@yogesh1801

Copy link
Copy Markdown
Contributor Author

@levkk is more test coverage needed here?

@levkk

levkk commented Jun 30, 2026

Copy link
Copy Markdown
Collaborator

Ah yes, please! Definitely integration tests are important for this one. Unit tests are too "artificial" - we need to throw some randomness at this and make sure it still works as expected.

@yogesh1801

Copy link
Copy Markdown
Contributor Author

@levkk have added some tests
update_multiple_rows_from_same_shard - from same shard we have multiple rows which goes to another shard
update_multiple_rows_goes_to_same_shard - all the rows go to same shard ( fast track )
update_rows_from_different_shards_to_same_shard - rows from different shards goes to same shard

what more scenarios should be covered?

@levkk

levkk commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

Cool! One last cargo fmt and we should be good to go here.

@levkk levkk requested a review from meskill July 1, 2026 16:40
@yogesh1801

yogesh1801 commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

Have fixed fmt errors, one thing i see is that tests are all for happy paths, maybe need to check for failures during insert or delete

@levkk

levkk commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

That would be great

@yogesh1801

Copy link
Copy Markdown
Contributor Author

@levkk was trying to come up with failure scenario where i could think of places where DELETE or INSERT failure but i think both of then are part of commit tests like 2pc

Should we cover thesr failures here too?

@levkk

levkk commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator

That would be great. I will never say no to a test :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants